SSUnique: Detecting Sequence Novelty in Microbiome Surveys

Authors

  • Michael D J Lynch
  • Josh D Neufeld
Abstract

High-throughput sequencing of small-subunit (SSU) rRNA genes has revolutionized understanding of microbial communities and facilitated investigations into ecological dynamics at unprecedented scales. Such extensive SSU rRNA gene sequence libraries, constructed from DNA extracts of environmental or host-associated samples, often contain a substantial proportion of unclassified sequences, many representing organisms with novel taxonomy (taxonomic "blind spots") and potentially unique ecology. Indeed, these novel taxonomic lineages are associated with so-called microbial "dark matter," which is the genomic potential of these lineages. Unfortunately, characterization beyond "unclassified" is challenging due to relatively short read lengths and large data set sizes. Here we demonstrate how mining of phylogenetically novel sequences from microbial ecosystems can be automated using SSUnique, a software pipeline that filters unclassified and/or rare operational taxonomic units (OTUs) from 16S rRNA gene sequence libraries by screening against consensus structural models for SSU rRNA. Phylogenetic position is inferred against a reference data set, and additional characterization of novel clades is also included, such as targeted probe/primer design and mining of assembled metagenomes for genomic context. We show how SSUnique reproduced a previous analysis of phylogenetic novelty from an Arctic tundra soil and demonstrate the recovery of highly novel clades from data sets associated with both the Earth Microbiome Project (EMP) and Human Microbiome Project (HMP). We anticipate that SSUnique will add to the expanding computational toolbox supporting high-throughput sequencing approaches for the study of microbial ecology and phylogeny. 
IMPORTANCE Extensive SSU rRNA gene sequence libraries, constructed from DNA extracts of environmental or host-associated samples, often contain many unclassified sequences, many representing organisms with novel taxonomy (taxonomic "blind spots") and potentially unique ecology. This novelty is poorly explored in standard workflows, which narrows the breadth and discovery potential of such studies. Here we present the SSUnique analysis pipeline, which will promote the exploration of unclassified diversity in microbiome research and, importantly, enable the discovery of substantial novel taxonomic lineages through the analysis of a large variety of existing data sets.
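The first filtering step described in the abstract, selecting unclassified and/or rare OTUs, can be illustrated with a minimal sketch. This is not SSUnique's own code: the `Otu` record, the `select_candidates` function, the `"unclassified"` label, the rank index, and the 0.1% relative-abundance threshold are all assumptions made for illustration, and the subsequent screening against SSU rRNA structural models is not shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Otu:
    """Hypothetical OTU record; field names are illustrative only."""
    otu_id: str
    taxonomy: List[str]  # labels from domain downward, e.g. ["Bacteria", "Firmicutes"]
    count: int           # number of reads assigned to this OTU


def select_candidates(otus: List[Otu],
                      max_rel_abundance: float = 0.001,
                      rank_index: int = 1) -> List[str]:
    """Return IDs of OTUs that are unclassified at the chosen rank and/or rare.

    Both the rank (index 1, assumed to be phylum) and the abundance
    threshold are assumed defaults, not values taken from the paper.
    """
    total = sum(o.count for o in otus)
    selected = []
    for o in otus:
        # A missing label at the chosen rank is treated as unclassified.
        label = o.taxonomy[rank_index] if rank_index < len(o.taxonomy) else ""
        unclassified = label.strip().lower() in ("", "unclassified")
        rare = total > 0 and (o.count / total) <= max_rel_abundance
        if unclassified or rare:
            selected.append(o.otu_id)
    return selected


example = [
    Otu("otu1", ["Bacteria", "Proteobacteria"], 9000),
    Otu("otu2", ["Bacteria", "unclassified"], 995),
    Otu("otu3", ["Bacteria", "Firmicutes"], 5),
]
candidates = select_candidates(example)
```

In this toy table, `otu2` is kept because it lacks a phylum-level label and `otu3` because it falls below the assumed abundance threshold. In a pipeline of the kind the abstract describes, such candidates would then go on to structural screening and phylogenetic placement.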

Similar resources

Cognitively Motivated Novelty Detection in Video Data Streams

Automatically detecting novel events in video data streams is an extremely challenging task. In recent years, machine-based parametric learning systems have been quite successful in exhaustively capturing novelty in video if the novelty filters are well-defined in constrained environments. Some important questions however remain: How close are such systems to human perception? Can results deriv...

Satellite thermal surveys to detecting hidden active faults and fault termination, Case study of Quchan fault, NE Iran

The Quchan fault is located in Quchan - Shirvan area which is a part of Chenaran- Bojnourd plain in Kopeh-Dagh zone, NE Iran. The Quchan active fault with northwest – southeast trending is one of the most important strike-slip faults in the area which its activity led to the numerous historical and instrumental earthquakes. The Neo-tectonic activities of this fault are investigated by the drain...

Using P300 to Evaluate the Effect of Object Color Knowledge in Novelty Detection

A B S T R A C T Introduction: In an oddball experiment, the context in which novel stimuli are presented affects characteristics of novelty P3, i.e. as long as there is a difficult task in which the difference between standard and target stimuli is small, recurrent presentation of a highly discrepant stimulus can lead to P300 highly similar to novelty P3. Effect of stimulus properties on P300 h...

Relational Frequent Patterns Mining for Novelty Detection from Data Streams

We face the problem of novelty detection from stream data, that is, the identification of new or unknown situations in an ordered sequence of objects which arrive on-line, at consecutive time points. We extend previous solutions by considering the case of objects modeled by multiple database relations. Frequent relational patterns are efficiently extracted at each time point, and a time window ...

Sediment microbiomes associated with critical habitat of the Juvenile American Horseshoe Crab; Limulus polyphemus

Plumb Beach, Brooklyn, New York in USA is an important horseshoe crab breeding and nursery ground that has experienced substantial anthropogenic influence, including pollution, erosion and subsequent restoration. Since little is known about the relationship between sediment microbial communities and juvenile horseshoe crab survival, next generation sequencing was used to characterize and compar...

Journal title:

Volume 1, Issue

Pages -

Publication date: 2016